{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# $$User\\ Defined\\ Metrics\\ Tutorial$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Contents\n",
    "* [1. Introduction](#1.\\-Introduction)\n",
    "* [2. Classification](#2.\\-Classification)\n",
    "* [3. Regression](#3.\\-Regression)\n",
    "* [4. Multiclassification](#4.\\-Multiclassification)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1. Introduction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "CatBoost lets you define your own loss functions and metrics and pass them to a model. To do this, implement classes with the special interfaces described below."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Interface for user defined objectives:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "class UserDefinedObjective(object):\n",
    "    def calc_ders_range(self, approxes, targets, weights):\n",
    "        # approxes, targets, weights are indexed containers of floats\n",
    "        # (containers which have only __len__ and __getitem__ defined).\n",
    "        # weights parameter can be None.\n",
    "        #\n",
    "        # To understand what these parameters mean, assume that there is\n",
    "        # a subset of your dataset that is currently being processed.\n",
    "        # approxes contains current predictions for this subset,\n",
    "        # targets contains target values you provided with the dataset.\n",
    "        #\n",
    "        # This function should return a list of pairs (der1, der2), where\n",
    "        # der1 is the first derivative of the loss function with respect\n",
    "        # to the predicted value, and der2 is the second derivative.\n",
    "        pass\n",
    "    \n",
    "class UserDefinedMultiClassObjective(object):\n",
    "    def calc_ders_multi(self, approxes, target, weight):\n",
    "        # approxes - indexed container of floats with predictions \n",
    "        #            for each dimension of single object\n",
    "        # target - contains a single expected value\n",
    "        # weight - contains weight of the object\n",
    "        #\n",
    "        # This function should return a tuple (der1, der2), where\n",
    "        # - der1 is a list-like object of first derivatives of the loss function with respect\n",
    "        # to the predicted value for each dimension.\n",
    "        # - der2 is a matrix of second derivatives.\n",
    "        pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Interface for user defined metrics:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "class UserDefinedMetric(object):\n",
    "    def is_max_optimal(self):\n",
    "        # Returns True if greater values of the metric are better\n",
    "        pass\n",
    "\n",
    "    def evaluate(self, approxes, target, weight):\n",
    "        # approxes is a list of indexed containers\n",
    "        # (containers with only __len__ and __getitem__ defined),\n",
    "        # one container per approx dimension.\n",
    "        # Each container contains floats.\n",
    "        # weight is a one dimensional indexed container.\n",
    "        # target is a one dimensional indexed container.\n",
    "        \n",
    "        # weight parameter can be None.\n",
    "        # Returns pair (error, weights sum)\n",
    "        pass\n",
    "    \n",
    "    def get_final_error(self, error, weight):\n",
    "        # Returns final value of metric based on error and weight\n",
    "        pass"
   ]
  },
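  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the metric interface more concrete, the cell below is a minimal sketch that calls the three methods by hand. `MaeMetric` (a weighted mean absolute error) and the sample values are hypothetical and chosen only for illustration; they are not part of CatBoost. The sketch assumes that the pair returned by `evaluate` is what `get_final_error` receives."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MaeMetric(object):  # hypothetical example metric, not part of CatBoost\n",
    "    def is_max_optimal(self):\n",
    "        # Smaller absolute error is better\n",
    "        return False\n",
    "\n",
    "    def evaluate(self, approxes, target, weight):\n",
    "        # One approx dimension is expected for this metric\n",
    "        approx = approxes[0]\n",
    "        error_sum = 0.0\n",
    "        weight_sum = 0.0\n",
    "        for i in range(len(approx)):\n",
    "            w = 1.0 if weight is None else weight[i]\n",
    "            weight_sum += w\n",
    "            error_sum += w * abs(approx[i] - target[i])\n",
    "        return error_sum, weight_sum\n",
    "\n",
    "    def get_final_error(self, error, weight):\n",
    "        return error / (weight + 1e-38)\n",
    "\n",
    "metric = MaeMetric()\n",
    "# approxes is a list with one container per approx dimension; weight is None here\n",
    "error_sum, weight_sum = metric.evaluate([[1.0, 2.0, 4.0]], [1.0, 3.0, 2.0], None)\n",
    "final_value = metric.get_final_error(error_sum, weight_sum)\n",
    "print(error_sum, weight_sum, final_value)"
   ]
  },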
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below we consider examples of user defined metrics for different types of tasks. We will use the following variables:\n",
    "<center>$a$ - approx value</center>\n",
    "<center>$p$ - probability</center>\n",
    "<center>$t$ - target</center>\n",
    "<center>$w$ - weight</center>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import necessary packages\n",
    "from catboost import CatBoostClassifier, CatBoostRegressor\n",
    "import numpy as np\n",
    "from sklearn.datasets import make_classification, make_regression\n",
    "from sklearn.model_selection import train_test_split"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2. Classification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: for binary classification problems approxes are not equal to probabilities. Probabilities are calculated from approxes using the sigmoid function:\n",
    "<h4><center>$p=\\frac{1}{1 + e^{-a}}=\\frac{e^a}{1 + e^a}$</center></h4>\n",
    "As an example, let's take the Logloss metric, which is defined by the following formulas:\n",
    "<h4><center>$Logloss_i = -{w_i * (t_i * log(p_i) + (1 - t_i) * log(1 - p_i))}$</center></h4>\n",
    "<h4><center>$Logloss = \\frac{\\sum_{i=1}^{N}{Logloss_i}}{\\sum_{i=1}^{N}{w_i}}$</center></h4>\n",
    "This metric is differentiable, so it can also be used as an objective. The objective implemented below returns the derivatives of the negated per-object loss, $-Logloss_i$ (the weighted log-likelihood), with respect to the approx $a_i$:\n",
    "<h4><center>$\\frac{\\partial(-Logloss_i)}{\\partial a_i} = w_i * (t_i - p_i)$</center></h4>\n",
    "<h4><center>$\\frac{\\partial^2(-Logloss_i)}{\\partial a_i^2} = -w_i * p_i * (1 - p_i)$</center></h4>\n",
    "The Logloss objective and metric are implemented below."
   ]
  },
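  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a reference for the derivative formulas above, here is a short sketch of the calculation for the bracketed term of $Logloss_i$, that is, for $t * log(p) + (1 - t) * log(1 - p)$ (the object index is omitted, and the weight $w_i$ is a constant factor). Since $log(p) = a - log(1 + e^a)$ and $log(1 - p) = -log(1 + e^a)$:\n",
    "<h4><center>$\\frac{\\partial log(p)}{\\partial a} = 1 - p, \\qquad \\frac{\\partial log(1 - p)}{\\partial a} = -p$</center></h4>\n",
    "<h4><center>$\\frac{\\partial}{\\partial a}\\left(t * log(p) + (1 - t) * log(1 - p)\\right) = t * (1 - p) - (1 - t) * p = t - p$</center></h4>\n",
    "Differentiating once more and using $\\frac{\\partial p}{\\partial a} = p * (1 - p)$ gives the second derivative:\n",
    "<h4><center>$\\frac{\\partial}{\\partial a}(t - p) = -p * (1 - p)$</center></h4>"
   ]
  },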
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "class LoglossObjective(object):\n",
    "    def calc_ders_range(self, approxes, targets, weights):\n",
    "        assert len(approxes) == len(targets)\n",
    "        if weights is not None:\n",
    "            assert len(weights) == len(approxes)\n",
    "        \n",
    "        result = []\n",
    "        for index in range(len(targets)):\n",
    "            e = np.exp(approxes[index])\n",
    "            p = e / (1 + e)\n",
    "            der1 = targets[index] - p\n",
    "            der2 = -p * (1 - p)\n",
    "\n",
    "            if weights is not None:\n",
    "                der1 *= weights[index]\n",
    "                der2 *= weights[index]\n",
    "\n",
    "            result.append((der1, der2))\n",
    "        return result"
   ]
  },
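  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The expressions used in `calc_ders_range` can be checked numerically. The cell below is a minimal sketch, not part of the CatBoost API: the helper names `weighted_log_likelihood` and `analytic_ders` and the sample values are assumptions made for illustration. It compares the two expressions with central finite differences of the weighted log-likelihood $w * (t * log(p) + (1 - t) * log(1 - p))$ for a single object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def weighted_log_likelihood(a, t, w):\n",
    "    # w * (t * log(p) + (1 - t) * log(1 - p)) for one object with approx a\n",
    "    p = 1.0 / (1.0 + np.exp(-a))\n",
    "    return w * (t * np.log(p) + (1 - t) * np.log(1 - p))\n",
    "\n",
    "def analytic_ders(a, t, w):\n",
    "    # Same expressions as in LoglossObjective.calc_ders_range\n",
    "    p = 1.0 / (1.0 + np.exp(-a))\n",
    "    return w * (t - p), -w * p * (1 - p)\n",
    "\n",
    "# Sample approx, target, weight and finite-difference step (arbitrary values)\n",
    "a, t, w, h = 0.3, 1.0, 2.0, 1e-4\n",
    "f_minus = weighted_log_likelihood(a - h, t, w)\n",
    "f_zero = weighted_log_likelihood(a, t, w)\n",
    "f_plus = weighted_log_likelihood(a + h, t, w)\n",
    "\n",
    "num_der1 = (f_plus - f_minus) / (2 * h)            # central first difference\n",
    "num_der2 = (f_plus - 2 * f_zero + f_minus) / h**2  # central second difference\n",
    "der1, der2 = analytic_ders(a, t, w)\n",
    "\n",
    "# Both absolute differences should be close to zero\n",
    "print(abs(num_der1 - der1), abs(num_der2 - der2))"
   ]
  },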
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "class LoglossMetric(object):\n",
    "    def get_final_error(self, error, weight):\n",
    "        return error / (weight + 1e-38)\n",
    "\n",
    "    def is_max_optimal(self):\n",
    "        return False\n",
    "\n",
    "    def evaluate(self, approxes, target, weight):\n",
    "        assert len(approxes) == 1\n",
    "        assert len(target) == len(approxes[0])\n",
    "\n",
    "        approx = approxes[0]\n",
    "\n",
    "        error_sum = 0.0\n",
    "        weight_sum = 0.0\n",
    "\n",
    "        for i in range(len(approx)):\n",
    "            e = np.exp(approx[i])\n",
    "            p = e / (1 + e)\n",
    "            w = 1.0 if weight is None else weight[i]\n",
    "            weight_sum += w\n",
    "            error_sum += -w * (target[i] * np.log(p) + (1 - target[i]) * np.log(1 - p))\n",
    "\n",
    "        return error_sum, weight_sum"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cells below train one model with the built-in Logloss function and another with our Logloss objective and metric. As the training logs show, the learn and test values are the same at every iteration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "X, y = make_classification(n_classes=2, random_state=0)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 0.6900380\ttest: 0.6907175\tbest: 0.6907175 (0)\ttotal: 49.5ms\tremaining: 446ms\n",
      "1:\tlearn: 0.6866060\ttest: 0.6873479\tbest: 0.6873479 (1)\ttotal: 51.8ms\tremaining: 207ms\n",
      "2:\tlearn: 0.6835392\ttest: 0.6852325\tbest: 0.6852325 (2)\ttotal: 54.1ms\tremaining: 126ms\n",
      "3:\tlearn: 0.6804590\ttest: 0.6829075\tbest: 0.6829075 (3)\ttotal: 56.4ms\tremaining: 84.6ms\n",
      "4:\tlearn: 0.6776740\ttest: 0.6816999\tbest: 0.6816999 (4)\ttotal: 58.6ms\tremaining: 58.6ms\n",
      "5:\tlearn: 0.6749116\ttest: 0.6794533\tbest: 0.6794533 (5)\ttotal: 61.8ms\tremaining: 41.2ms\n",
      "6:\tlearn: 0.6712701\ttest: 0.6772634\tbest: 0.6772634 (6)\ttotal: 65ms\tremaining: 27.8ms\n",
      "7:\tlearn: 0.6681755\ttest: 0.6747041\tbest: 0.6747041 (7)\ttotal: 68.2ms\tremaining: 17ms\n",
      "8:\tlearn: 0.6658881\ttest: 0.6732683\tbest: 0.6732683 (8)\ttotal: 71.3ms\tremaining: 7.93ms\n",
      "9:\tlearn: 0.6633931\ttest: 0.6720979\tbest: 0.6720979 (9)\ttotal: 73.7ms\tremaining: 0us\n",
      "\n",
      "bestTest = 0.6720978617\n",
      "bestIteration = 9\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<catboost.core.CatBoostClassifier at 0x7f10798d22e8>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model1 = CatBoostClassifier(iterations=10, loss_function='Logloss', eval_metric='Logloss',\n",
    "                            learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,\n",
    "                            leaf_estimation_iterations=1, leaf_estimation_method='Gradient')\n",
    "model1.fit(X_train, y_train, eval_set=(X_test, y_test))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 0.6900380\ttest: 0.6907175\tbest: 0.6907175 (0)\ttotal: 4.36ms\tremaining: 39.2ms\n",
      "1:\tlearn: 0.6866060\ttest: 0.6873479\tbest: 0.6873479 (1)\ttotal: 9.44ms\tremaining: 37.8ms\n",
      "2:\tlearn: 0.6835392\ttest: 0.6852325\tbest: 0.6852325 (2)\ttotal: 15.2ms\tremaining: 35.5ms\n",
      "3:\tlearn: 0.6804590\ttest: 0.6829075\tbest: 0.6829075 (3)\ttotal: 19.8ms\tremaining: 29.6ms\n",
      "4:\tlearn: 0.6776740\ttest: 0.6816999\tbest: 0.6816999 (4)\ttotal: 24.5ms\tremaining: 24.5ms\n",
      "5:\tlearn: 0.6749116\ttest: 0.6794533\tbest: 0.6794533 (5)\ttotal: 29.2ms\tremaining: 19.5ms\n",
      "6:\tlearn: 0.6712701\ttest: 0.6772634\tbest: 0.6772634 (6)\ttotal: 34.8ms\tremaining: 14.9ms\n",
      "7:\tlearn: 0.6681755\ttest: 0.6747041\tbest: 0.6747041 (7)\ttotal: 40ms\tremaining: 10ms\n",
      "8:\tlearn: 0.6658881\ttest: 0.6732683\tbest: 0.6732683 (8)\ttotal: 45.2ms\tremaining: 5.03ms\n",
      "9:\tlearn: 0.6633931\ttest: 0.6720979\tbest: 0.6720979 (9)\ttotal: 50.6ms\tremaining: 0us\n",
      "\n",
      "bestTest = 0.6720978617\n",
      "bestIteration = 9\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<catboost.core.CatBoostClassifier at 0x7f10798d2048>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model2 = CatBoostClassifier(iterations=10, loss_function=LoglossObjective(), eval_metric=LoglossMetric(), \n",
    "                            learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,\n",
    "                            leaf_estimation_iterations=1, leaf_estimation_method='Gradient')\n",
    "model2.fit(X_train, y_train, eval_set=(X_test, y_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3. Regression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For regression, approxes don't need any transformation. As an example of a regression loss function and metric we take the well-known RMSE, which is defined by the following formula:\n",
    "<h3><center>$RMSE = \\sqrt{\\frac{\\sum_{i=1}^{N}{w_i * (t_i - a_i)^2}}{\\sum_{i=1}^{N}{w_i}}}$</center></h3>\n",
    "The derivatives returned by the objective below are those of the per-object term $RMSE_i = -\\frac{1}{2} * w_i * (t_i - a_i)^2$ (the negated, halved weighted squared error) with respect to the approx $a_i$:\n",
    "<h4><center>$\\frac{\\partial(RMSE_i)}{\\partial a_i} = w_i * (t_i - a_i)$</center></h4>\n",
    "<h4><center>$\\frac{\\partial^2(RMSE_i)}{\\partial a_i^2} = -w_i$</center></h4>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "class RmseObjective(object):\n",
    "    def calc_ders_range(self, approxes, targets, weights):\n",
    "        assert len(approxes) == len(targets)\n",
    "        if weights is not None:\n",
    "            assert len(weights) == len(approxes)\n",
    "        \n",
    "        result = []\n",
    "        for index in range(len(targets)):\n",
    "            der1 = targets[index] - approxes[index]\n",
    "            der2 = -1\n",
    "\n",
    "            if weights is not None:\n",
    "                der1 *= weights[index]\n",
    "                der2 *= weights[index]\n",
    "\n",
    "            result.append((der1, der2))\n",
    "        return result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "class RmseMetric(object):\n",
    "    def get_final_error(self, error, weight):\n",
    "        return np.sqrt(error / (weight + 1e-38))\n",
    "\n",
    "    def is_max_optimal(self):\n",
    "        return False\n",
    "\n",
    "    def evaluate(self, approxes, target, weight):\n",
    "        assert len(approxes) == 1\n",
    "        assert len(target) == len(approxes[0])\n",
    "\n",
    "        approx = approxes[0]\n",
    "\n",
    "        error_sum = 0.0\n",
    "        weight_sum = 0.0\n",
    "\n",
    "        for i in range(len(approx)):\n",
    "            w = 1.0 if weight is None else weight[i]\n",
    "            weight_sum += w\n",
    "            error_sum += w * ((approx[i] - target[i])**2)\n",
    "\n",
    "        return error_sum, weight_sum"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cells below train one model with the built-in RMSE function and another with our RMSE objective and metric. As the training logs show, the learn and test values are the same at every iteration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "X, y = make_regression(random_state=0)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 128.6631656\ttest: 140.6536718\tbest: 140.6536718 (0)\ttotal: 3.86ms\tremaining: 34.8ms\n",
      "1:\tlearn: 128.0351695\ttest: 140.7369887\tbest: 140.6536718 (0)\ttotal: 8.51ms\tremaining: 34ms\n",
      "2:\tlearn: 126.7781283\ttest: 141.0444768\tbest: 140.6536718 (0)\ttotal: 11.6ms\tremaining: 27.2ms\n",
      "3:\tlearn: 125.7603646\ttest: 141.1458855\tbest: 140.6536718 (0)\ttotal: 15.9ms\tremaining: 23.8ms\n",
      "4:\tlearn: 124.6922146\ttest: 141.0856002\tbest: 140.6536718 (0)\ttotal: 18.6ms\tremaining: 18.6ms\n",
      "5:\tlearn: 123.6667350\ttest: 141.0495141\tbest: 140.6536718 (0)\ttotal: 21.1ms\tremaining: 14.1ms\n",
      "6:\tlearn: 122.7210914\ttest: 140.8511986\tbest: 140.6536718 (0)\ttotal: 23.7ms\tremaining: 10.2ms\n",
      "7:\tlearn: 121.8418528\ttest: 140.7646996\tbest: 140.6536718 (0)\ttotal: 26.3ms\tremaining: 6.58ms\n",
      "8:\tlearn: 121.0103984\ttest: 140.4834561\tbest: 140.4834561 (8)\ttotal: 28.9ms\tremaining: 3.21ms\n",
      "9:\tlearn: 119.9286951\ttest: 140.2935285\tbest: 140.2935285 (9)\ttotal: 31.5ms\tremaining: 0us\n",
      "\n",
      "bestTest = 140.2935285\n",
      "bestIteration = 9\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<catboost.core.CatBoostRegressor at 0x7f10a84f9c50>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model1 = CatBoostRegressor(iterations=10, loss_function='RMSE', eval_metric='RMSE',\n",
    "                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,\n",
    "                           leaf_estimation_iterations=1, leaf_estimation_method='Gradient')\n",
    "model1.fit(X_train, y_train, eval_set=(X_test, y_test))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 128.6631656\ttest: 140.6536718\tbest: 140.6536718 (0)\ttotal: 4.01ms\tremaining: 36.1ms\n",
      "1:\tlearn: 128.0351695\ttest: 140.7369887\tbest: 140.6536718 (0)\ttotal: 6.72ms\tremaining: 26.9ms\n",
      "2:\tlearn: 126.7781283\ttest: 141.0444768\tbest: 140.6536718 (0)\ttotal: 9.52ms\tremaining: 22.2ms\n",
      "3:\tlearn: 125.7603646\ttest: 141.1458855\tbest: 140.6536718 (0)\ttotal: 12.2ms\tremaining: 18.3ms\n",
      "4:\tlearn: 124.6922146\ttest: 141.0856002\tbest: 140.6536718 (0)\ttotal: 17.5ms\tremaining: 17.5ms\n",
      "5:\tlearn: 123.6667350\ttest: 141.0495141\tbest: 140.6536718 (0)\ttotal: 20.6ms\tremaining: 13.7ms\n",
      "6:\tlearn: 122.7210914\ttest: 140.8511986\tbest: 140.6536718 (0)\ttotal: 23.4ms\tremaining: 10ms\n",
      "7:\tlearn: 121.8418528\ttest: 140.7646996\tbest: 140.6536718 (0)\ttotal: 26.4ms\tremaining: 6.59ms\n",
      "8:\tlearn: 121.0103984\ttest: 140.4834561\tbest: 140.4834561 (8)\ttotal: 30.5ms\tremaining: 3.39ms\n",
      "9:\tlearn: 119.9286951\ttest: 140.2935285\tbest: 140.2935285 (9)\ttotal: 35.2ms\tremaining: 0us\n",
      "\n",
      "bestTest = 140.2935285\n",
      "bestIteration = 9\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<catboost.core.CatBoostRegressor at 0x7f1079365080>"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model2 = CatBoostRegressor(iterations=10, loss_function=RmseObjective(), eval_metric=RmseMetric(),\n",
    "                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,\n",
    "                           leaf_estimation_iterations=1, leaf_estimation_method='Gradient')\n",
    "model2.fit(X_train, y_train, eval_set=(X_test, y_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 4. Multiclassification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: for multiclassification problems approxes are not equal to probabilities. Usually approxes are transformed to probabilities using the Softmax function:\n",
    "<h3><center>$p_{i,c} = \\frac{e^{a_{i,c}}}{\\sum_{j=1}^k{e^{a_{i,j}}}}$</center></h3>\n",
    "<center>$p_{i,c}$ - the probability that $x_i$ belongs to class $c$</center>\n",
    "<center>$k$ - number of classes</center>\n",
    "<center>$a_{i,j}$ - approx for object $x_i$ for class $j$</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's implement the MultiClass objective, which is defined as follows:\n",
    "<h3><center>$MultiClass_i = w_i * \\log{p_{i,t_i}}$</center></h3>\n",
    "<h3><center>$MultiClass = \\frac{\\sum_{i=1}^{N}MultiClass_i}{\\sum_{i=1}^{N}w_i}$</center></h3>\n",
    "\n",
    "<h3><center>$\\frac{\\partial(MultiClass_i)}{\\partial{a_{i,c}}} = \\begin{cases} \n",
    "w_i-\\frac{w_i*e^{a_{i,c}}}{\\sum_{j=1}^{k}e^{a_{i,j}}}, & \\mbox{if } c = t_i \\\\ \n",
    "-\\frac{w_i*e^{a_{i,c}}}{\\sum_{j=1}^{k}e^{a_{i,j}}}, & \\mbox{if } c \\neq t_i \n",
    "\\end{cases}$</center></h3>\n",
    "\n",
    "<h3><center>$\\frac{\\partial^2(MultiClass_i)}{\\partial{a_{i,c_1}}\\partial{a_{i,c_2}}} = \\begin{cases} \n",
    "\\frac{w_i*e^{2*a_{i,c_1}}}{(\\sum_{j=1}^{k}e^{a_{i,j}})^2} - \\frac{w_i*e^{a_{i, c_1}}}{\\sum_{j=1}^{k}e^{a_{i,j}}}, & \\mbox{if } c_1 = c_2 \\\\ \n",
    "\\frac{w_i*e^{a_{i,c_1}+a_{i,c_2}}}{(\\sum_{j=1}^{k}e^{a_{i,j}})^2}, & \\mbox{if } c_1 \\neq c_2 \n",
    "\\end{cases}$</center></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MultiClassObjective(object):\n",
    "    def calc_ders_multi(self, approx, target, weight):\n",
    "        approx = np.array(approx) - max(approx)\n",
    "        exp_approx = np.exp(approx)\n",
    "        exp_sum = exp_approx.sum()\n",
    "        grad = []\n",
    "        hess = []\n",
    "        for j in range(len(approx)):\n",
    "            der1 = -exp_approx[j] / exp_sum\n",
    "            if j == target:\n",
    "                der1 += 1\n",
    "            hess_row = []\n",
    "            for j2 in range(len(approx)):\n",
    "                der2 = exp_approx[j] * exp_approx[j2] / (exp_sum**2)\n",
    "                if j2 == j:\n",
    "                    der2 -= exp_approx[j] / exp_sum\n",
    "                hess_row.append(der2 * weight)\n",
    "                \n",
    "            grad.append(der1 * weight)\n",
    "            hess.append(hess_row)\n",
    "            \n",
    "        return (grad, hess)"
   ]
  },
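  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`MultiClassObjective` above subtracts `max(approx)` from the approxes before exponentiating. A short check, with $m$ denoting that maximum, shows why this leaves the Softmax probabilities unchanged while keeping every exponent at or below zero, which avoids overflow in `np.exp`:\n",
    "<h3><center>$\\frac{e^{a_{i,c} - m}}{\\sum_{j=1}^{k}e^{a_{i,j} - m}} = \\frac{e^{-m} * e^{a_{i,c}}}{e^{-m} * \\sum_{j=1}^{k}e^{a_{i,j}}} = p_{i,c}$</center></h3>\n",
    "The gradient and the matrix of second derivatives computed in the class depend on the approxes only through these probabilities, so they are unchanged by the shift as well."
   ]
  },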
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "class AccuracyMetric(object):\n",
    "    def get_final_error(self, error, weight):\n",
    "        return error / (weight + 1e-38)\n",
    "\n",
    "    def is_max_optimal(self):\n",
    "        return True\n",
    "\n",
    "    def evaluate(self, approxes, target, weight):\n",
    "        best_class = np.argmax(approxes, axis=0)\n",
    "        \n",
    "        accuracy_sum = 0\n",
    "        weight_sum = 0 \n",
    "\n",
    "        for i in range(len(target)):\n",
    "            w = 1.0 if weight is None else weight[i]\n",
    "            weight_sum += w\n",
    "            accuracy_sum += w * (best_class[i] == target[i])\n",
    "\n",
    "        return accuracy_sum, weight_sum"
   ]
  },
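  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cells below train one model with the built-in MultiClass function and Accuracy metric and another with our MultiClass objective and Accuracy metric. As the training logs show, the results agree at every iteration except for the test accuracy at iteration 0 (0.240 with the built-in functions, 0.252 with ours)."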
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "X, y = make_classification(n_samples=1000, n_features=50, n_informative=40, n_classes=5, random_state=0)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 0.3706667\ttest: 0.2400000\tbest: 0.2400000 (0)\ttotal: 22.3ms\tremaining: 201ms\n",
      "1:\tlearn: 0.4813333\ttest: 0.2760000\tbest: 0.2760000 (1)\ttotal: 35.2ms\tremaining: 141ms\n",
      "2:\tlearn: 0.5400000\ttest: 0.3120000\tbest: 0.3120000 (2)\ttotal: 46.9ms\tremaining: 109ms\n",
      "3:\tlearn: 0.6026667\ttest: 0.3040000\tbest: 0.3120000 (2)\ttotal: 59.3ms\tremaining: 88.9ms\n",
      "4:\tlearn: 0.6573333\ttest: 0.3120000\tbest: 0.3120000 (2)\ttotal: 71.4ms\tremaining: 71.4ms\n",
      "5:\tlearn: 0.6933333\ttest: 0.3360000\tbest: 0.3360000 (5)\ttotal: 83.3ms\tremaining: 55.5ms\n",
      "6:\tlearn: 0.7000000\ttest: 0.3440000\tbest: 0.3440000 (6)\ttotal: 95.4ms\tremaining: 40.9ms\n",
      "7:\tlearn: 0.7040000\ttest: 0.3520000\tbest: 0.3520000 (7)\ttotal: 107ms\tremaining: 26.9ms\n",
      "8:\tlearn: 0.7293333\ttest: 0.3720000\tbest: 0.3720000 (8)\ttotal: 120ms\tremaining: 13.3ms\n",
      "9:\tlearn: 0.7600000\ttest: 0.3960000\tbest: 0.3960000 (9)\ttotal: 132ms\tremaining: 0us\n",
      "\n",
      "bestTest = 0.396\n",
      "bestIteration = 9\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<catboost.core.CatBoostClassifier at 0x7f10798d2080>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model1 = CatBoostClassifier(iterations=10, loss_function='MultiClass', eval_metric='Accuracy',\n",
    "                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,\n",
    "                           leaf_estimation_iterations=1, leaf_estimation_method='Newton', classes_count=5)\n",
    "model1.fit(X_train, y_train, eval_set=(X_test, y_test))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 0.3706667\ttest: 0.2520000\tbest: 0.2520000 (0)\ttotal: 217ms\tremaining: 1.95s\n",
      "1:\tlearn: 0.4813333\ttest: 0.2760000\tbest: 0.2760000 (1)\ttotal: 432ms\tremaining: 1.73s\n",
      "2:\tlearn: 0.5400000\ttest: 0.3120000\tbest: 0.3120000 (2)\ttotal: 649ms\tremaining: 1.51s\n",
      "3:\tlearn: 0.6026667\ttest: 0.3040000\tbest: 0.3120000 (2)\ttotal: 863ms\tremaining: 1.29s\n",
      "4:\tlearn: 0.6573333\ttest: 0.3120000\tbest: 0.3120000 (2)\ttotal: 1.08s\tremaining: 1.08s\n",
      "5:\tlearn: 0.6933333\ttest: 0.3360000\tbest: 0.3360000 (5)\ttotal: 1.3s\tremaining: 869ms\n",
      "6:\tlearn: 0.7000000\ttest: 0.3440000\tbest: 0.3440000 (6)\ttotal: 1.52s\tremaining: 653ms\n",
      "7:\tlearn: 0.7040000\ttest: 0.3520000\tbest: 0.3520000 (7)\ttotal: 1.75s\tremaining: 436ms\n",
      "8:\tlearn: 0.7293333\ttest: 0.3720000\tbest: 0.3720000 (8)\ttotal: 1.96s\tremaining: 218ms\n",
      "9:\tlearn: 0.7600000\ttest: 0.3960000\tbest: 0.3960000 (9)\ttotal: 2.18s\tremaining: 0us\n",
      "\n",
      "bestTest = 0.396\n",
      "bestIteration = 9\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<catboost.core.CatBoostClassifier at 0x7f10798b0be0>"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model2 = CatBoostClassifier(iterations=10, loss_function=MultiClassObjective(), eval_metric=AccuracyMetric(),\n",
    "                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,\n",
    "                           leaf_estimation_iterations=1, leaf_estimation_method='Newton', classes_count=5)\n",
    "model2.fit(X_train, y_train, eval_set=(X_test, y_test))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
